(54) Title: METHOD OF PROCESSING A TEXT, GESTURE, FACIAL EXPRESSION, AND/OR BEHAVIOR DESCRIPTION
COMPRISING A TEST OF THE AUTHORIZATION FOR USING CORRESPONDING PROFILES FOR SYNTHESIS

(57) Abstract: The invention relates to a method of processing a text, gesture, facial expression, and/or behavior description wherein 
the text, gesture, facial expression, and/or behavior description is synthesized by means of a speech, gesture, facial expression, and/or 
behavior profile, provided that an authorization code legitimizes the use of the speech, gesture, facial expression, and/or behavior
profile for the synthesis. 
METHOD OF PROCESSING A TEXT, GESTURE, FACIAL EXPRESSION, AND/OR BEHAVIOR
DESCRIPTION COMPRISING A TEST OF THE AUTHORIZATION FOR USING CORRESPONDING 
PROFILES FOR SYNTHESIS. 



The invention relates to a method of processing a text description and is of 
importance for the technical field of text and speech synthesis. By analogy, however, the 
invention also relates to a method of processing a gesture, facial expression, and/or behavior 
description and is accordingly also of importance for the synthesis of image information.

Speech synthesis systems, i.e. systems for converting written text into spoken
language, play a part in many applications. Examples are telephonic information or
transaction systems in which system replies arranged in text form are to be read out first to a 
user. Concrete examples are systems for timetable or stock exchange share price information 
and for buying tickets or shares. Further applications are found in the so-termed "unified 
messaging" systems which serve to render possible the access to documents via several
media such as, for example, the PC, telephone, picture telephone, and fax machine. Here, too,
documents available in written form must be read to a user accessing through the telephone. 

The flexibility in particular of automatic dialogue systems, such as the 
telephonic information or transaction systems, can be further enhanced beyond the speech 
synthesis in that a text synthesis is connected upstream of the speech synthesis systems. In
this case, the information to be provided by the system is initially available only in the form 
of purely semantic information units, which are subsequently converted into a text by the text 
synthesis system through a concrete choice of the speaking style, for example elaborate or 
restricted, in many or few words, of the vocabulary, and/or of other characteristics such as, 
for example, the level of politeness and/or special characteristics of the application.

Thus, for example, a telephone information system may first react to a user
request by retrieving from a database the information (German example) "place: Munich,
prefix: 089, Christian name: Susanne, family name: Meyer, sex: female, telephone number:
45446538" and make these data available to the text synthesis system. This system may then
form therefrom, for example, the sentence: "The telephone number of Mrs. Susanne Meyer
requested by you is 45, 44, 65, 38. The prefix is Munich, 089."
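
As a minimal sketch of the text synthesis step just described, the following Python fragment turns the semantic information units of the Munich example into a sentence. The function names `group_digits` and `synthesize_text`, the dictionary keys, and the `style` parameter are hypothetical names chosen for illustration; the description does not prescribe any data format or implementation.

```python
def group_digits(number: str, size: int = 2) -> str:
    """Split a digit string into groups for reading out, e.g. '4544' -> '45, 44'."""
    return ", ".join(number[i:i + size] for i in range(0, len(number), size))


def synthesize_text(record: dict, style: str = "elaborate") -> str:
    """Convert purely semantic information units into a sentence.

    `style` stands in for the speaking style chosen by the language profile.
    """
    title = "Mrs." if record["sex"] == "female" else "Mr."
    name = f'{title} {record["christian_name"]} {record["family_name"]}'
    number = group_digits(record["telephone_number"])
    if style == "elaborate":
        return (f"The telephone number of {name} requested by you is {number}. "
                f'The prefix is {record["place"]}, {record["prefix"]}.')
    # Restricted style: the same information in few words.
    return f'{name}: {record["prefix"]} {number}.'


record = {
    "place": "Munich", "prefix": "089", "christian_name": "Susanne",
    "family_name": "Meyer", "sex": "female", "telephone_number": "45446538",
}
print(synthesize_text(record))
```

Because the semantic record stays the same for every style, only `synthesize_text` would have to change when the system is transferred to another language or speaking style.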



WO 02/099784 PCT/IB02/02051

Such a subdivision of the work among the respective semantic components of 
an automatic dialogue system, the text synthesis, and the speech synthesis considerably 
enhances the flexibility of the system design and the system maintenance if changes become 
necessary later, and is accordingly widely used in practical systems. This separation thus 
renders it possible, for example, to transfer such a system much more easily to a new
language, because such a transfer does not affect, for example, the semantic components. 

To achieve compact formulations, the basic information made available as 
input data for the text and speech synthesis is denoted text description, and the concept of 
speech synthesis is extended in the sense that it also comprises the step denoted text synthesis
above. Accordingly, the expression "language profile" embraces the totality of the further
information sources used for the synthesis by the speech synthesis systems, such as, for 
example, the elements mentioned above of speech style, choice of words, level of politeness, 
and application characteristics. Depending on the embodiment, however, the text description 
may comprise not only purely semantic information but also indications, for example, as
regards text synthesis. In an extreme case the text description may thus already indicate, for
example, the text itself, in which case the language profile essentially comprises no more 
than the intonation and the voice used for the speech synthesis. The concepts of text 
description and language profile should accordingly comprise all these possible embodiments 
in the present context. 

Besides the speech synthesis of text descriptions, an image synthesis relating
to gestures and/or facial expressions is also possible. Thus it is useful, for example, to reduce
the quantity of data to be transmitted in the case of picture telephony in that the images are 
transmitted in a strongly compressed form. High compression rates can be achieved here 
through the use of image recognition at the transmitting side and image synthesis at the
receiving side, because merely an image description is to be transmitted. It is necessary here,
by analogy to the speech synthesis, that gesture and facial expression profiles are present at 
the receiving side, analogous to the speech profile, the characteristics of which are used for 
the synthesis. The facial traits and body contours of a certain person may then form part of 
such a gesture and facial expression profile. The profiles, however, may also comprise further
characteristics of the person as regards gestures and facial expression, such as, for example,
the manner in which this person smiles and carries out a certain hand movement. The 
synthesis methods then attempt, for example, to bring the synthesized speech, gestures, and 
facial expressions as close as possible to the real expressions of this person.

In a further step, behavior can be synthesized in addition to speech, gestures,
and facial expressions. An example of behavior is the problem solving strategy of a person or 
a system, i.e., for example, the way in which a railway information system asks for the data 
required for giving information relating to starting point, destination, and time of the desired 
connection. Behavior thus typically extends over a longer period of the interaction
between a system and a user. It can be synthesized from a behavior description by means of a
behavior profile in the same way as language, gestures, and facial expressions. 

The text, gesture, facial expression, and/or behavior descriptions required for a 
synthesis may be put in directly or may be obtained through a recognition process from 
speech and/or picture signals. Furthermore, it is also conceivable to generate automatically a
fitting gesture and/or facial expression description for a text description which is available. In 
the case of questions, for example, the eyebrows could be raised, and the index finger could 
be raised in the case of exclamations. Suitable gesture and/or facial expression indications 
could also be integrated in the text description, if so desired also in combination with control 
indicators for the speech synthesis. Inversely, suitable texts could be generated so as to fit
certain gesture and/or facial expression events. A surprised facial expression might thus 
generate the exclamation "oh!". It is furthermore obvious that there are close relationships 
also between behavior, speech, gestures, and facial expressions. 

A cooperative behavior is thus typically accompanied, for example, by polite 
formulations, friendly gestures, and a smiling face.

These examples show that text, gesture, facial expression, and/or behavior 
descriptions can be synthesized by means of speech, gesture, facial expression, and/or 
behavior profiles in many ways. For example, a text description may be utilized both for
a speech synthesis and for a gesture and/or facial expression synthesis.

Whose speech, gesture, facial expression, and/or behavior profile is used for
such a synthesis may depend on the nature and purpose of the description to be synthesized.
If, for example, a picture telephone connection is to be established between two participants, 
it will be normal practice that the text, gesture, facial expression, and behavior descriptions 
are synthesized at the receiving side by means of the profiles of the respective person at the 
transmitting side. A further conceivable application, however, is the transmission of
acoustically and visually animated electronic greeting cards. For example, a sender could
himself record a birthday song composed by himself for this purpose for one of his friends, 
but choose the profile of a famous singer for the performance of the song at his friend's end. 
At the receiving side, therefore, text, gesture, facial expression, and behavior descriptions of 
the sender could be converted into an audiovisual representation which shows the famous
singer performing the text of the sender with the gestures, facial expressions, and behavior of 
the sender. The gestures, facial expressions, and behavior of the singer could alternatively be 
used as well for the presentation of the text of the sender, in dependence on the wishes of the 
sender or also of the recipient.

Artificially created profiles may be used as well as the profiles of real persons. 
It is thus usual, for example, in certain Internet applications to guide the user by means of 
artificial characters, so-called avatars. Such artificially created profiles may also be used for 
the synthesis of text, gesture, facial expression, and/or behavior descriptions. 

US 6,035,273 discloses a speech communication system which manages to
operate with a low data transmission rate. To achieve this, the spoken utterances of the sender
are supplied to a speech recognition system at the transmitting side and are converted into 
text in this manner (speech to text). The text is transmitted to the receiving side, which 
requires a low data rate of approximately 160 to 300 bits per second. At the receiving side,
the text is speech-synthesized by means of the speech profile of the sender, i.e. it is converted
into spoken language again (text to speech). 

US 6,035,273 mentions several possibilities for making the language profile of 
the sender available at the receiving side. Thus, for example, the sender may send his 
language profile along with the text. Alternatively, however, the language profile may be
stored in a device which is connected to a network (remote CTD, incorporated into a
switching system or other network element) and may be called up by the receiver via the
network. Furthermore, the language profile may also be separately transmitted to the 
receiver, for example before the transmission of the text, instead of being sent along with the 
text (transmission of speech profiles between CTDs). 

US 6,035,273 does indeed disclose the use of speech recognition and
subsequent speech synthesis by means of the language profile of the sender as an efficient
transmission technique for speech communication, but it provides no protection mechanisms 
for the use of the language profile. There is a possibility, however, of misuse of such 
language profiles, in particular in connection with the application possibilities of the
language profiles of famous personalities, which leads to an obvious demand for protection
of language profiles for reasons of personal privacy alone. For example, a well known singer 
could be prepared to make his voice, gesture, facial expression, and behavior profiles 
available against payment only, and possibly also exclusively for the representation of 
selected text, gesture, facial expression, and/or behavior descriptions. The free availability of
his voice, gesture, facial expression, and behavior profiles would accordingly open the door
to a misuse of his artistic copyright. A similar argument holds for the protection of artificially
created profiles, for example of avatars, whose development may have involved a 
considerable expenditure.

It is accordingly an object of the invention to provide a method of processing a
text, gesture, facial expression, and/or behavior description of the kind mentioned in the
opening paragraph which pays due regard to the interests of the owners of the speech, 
gesture, facial expression, and/or behavior profiles to be used for the speech, gesture, facial 
expression, and/or behavior synthesis, i.e. to their interests in a controlled use of their 
profiles.

This object is achieved on the one hand by means of a 

- method of processing a text, gesture, facial expression, and/or behavior description 
wherein the text, gesture, facial expression, and/or behavior description is synthesized by 
means of a speech, gesture, facial expression, and/or behavior profile, provided that an
authorization code legitimizes the use of the speech, gesture, facial expression, and/or
behavior profile for the synthesis, 
and on the other hand by means of a 

- system for processing a text, gesture, facial expression, and/or behavior description 
wherein the text, gesture, facial expression, and/or behavior description is synthesized by
means of a speech, gesture, facial expression, and/or behavior profile, provided that an
authorization code legitimizes the use of the speech, gesture, facial expression, and/or 
behavior profile for the synthesis. 

The introduction of an authorization code into the method renders it possible 
to test whether a user of the method is authorized to use the speech, gesture, facial 
expression, and/or behavior profile for the purpose of synthesis. The interests of the owner of
the speech, gesture, facial expression, and/or behavior profiles to be used for the speech, 
gesture, facial expression, and/or behavior synthesis in a controlled use of said profiles are 
safeguarded thereby. 

Claim 2 relates to an embodiment of the method in which it is tested whether 
the authorization code legitimizes the use of the speech, gesture, facial expression, and/or
behavior profile for the synthesis of an actually present text, gesture, facial expression, and/or
behavior description. Claim 3 describes an embodiment in which an authorization code may 
be used for the authorization of a given number of cases. This includes, for example, the case 
in which a user of the method has obtained an authorization code which allows him to use the 
speech, gesture, facial expression, and behavior profiles of a famous singer five times, or also
the case in which the user is allowed to use these profiles only twice for the synthesis of a 
single text, gesture, facial expression, and/or behavior description characterized by the 
respective authorization code. 
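
The two limits described for claim 3 can be illustrated with a short Python sketch. It is only one possible reading: the class name `UsageLimits`, its field names, and the description identifier strings are hypothetical, and the sketch enforces both limits at once although the claim also permits using either one alone.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class UsageLimits:
    """Tracks how often one authorization code has legitimized a synthesis."""
    max_total: int              # limit on all uses of the profile
    max_per_description: int    # limit on uses for one and the same description
    total_uses: int = 0
    uses_per_description: Counter = field(default_factory=Counter)

    def try_use(self, description_id: str) -> bool:
        """Record one use and return True only if both limits still allow it."""
        if self.total_uses >= self.max_total:
            return False
        if self.uses_per_description[description_id] >= self.max_per_description:
            return False
        self.total_uses += 1
        self.uses_per_description[description_id] += 1
        return True


# Five uses in total, but at most two for any single description.
limits = UsageLimits(max_total=5, max_per_description=2)
results = [limits.try_use("birthday-song") for _ in range(3)]
print(results)  # [True, True, False]
```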
Claim 4 describes an embodiment in which the authorization code is used in
an encrypted form in the method. This may be used on the one hand as a security measure in
the transmission of the authorization code via a network so as to prevent an unauthorized 
copying of the authorization code by third parties, or at least to render this copying more 
difficult. On the other hand, it gives rise to the possibility, for example, to safeguard the
integrity of the interrelationship between authorization code and text, gesture, facial
expression, and/or behavior description and speech, gesture, facial expression, and/or 
behavior profile in the case of an authorization code which, according to claim 2, legitimizes 
only the synthesis of a concrete text, gesture, facial expression, and/or behavior description, 
i.e., for example, the use of the authorization code for a spurious text is prevented or at least
rendered more difficult.

These and further aspects and advantages of the invention will be explained in 
more detail below with reference to the embodiments and in particular with reference to the 
appended drawings, in which: 

Fig. 1 shows an embodiment of a system arrangement in which a method
according to the invention for processing a text, gesture, facial expression, and/or behavior
description can be implemented, 

Fig. 2 diagrammatically shows the sequence of steps for obtaining an 
authorization code for speech synthesis of a text by means of the language profile of a well 
known personality in the form of a flowchart, 

Fig. 3 diagrammatically shows the sequence of steps involved in the use of an
authorization code for speech synthesis of a text by means of the language profile of a well
known personality and the transmission of the synthesized text to a target telephone number 
in the form of a flowchart, and 

Figs. 4a, 4b diagrammatically show the sequence of steps for creating,
transmitting, receiving, and speech-synthesizing a text by means of the language profile of
the author of the text in two flowcharts. 

Fig. 1 shows an embodiment of a system arrangement in which a method 
according to the invention for processing a text, gesture, facial expression, and/or behavior 
description can be implemented. Various components are shown therein which offer various 
possibilities for using the method according to the invention. Said components may
accordingly be all present collectively or only in part, depending on the individual 
embodiment. 

The appliances shown in Fig. 1 are interconnected by means of wires or in a 
wireless manner via a network 20, for example the Internet or a telephone network. The
interconnection may be present over a longer period, or it may be established temporarily 
only, as required, as is the case, for example, with a telephone conversation. 

Four appliances are shown by means of which a user may be connected to the 
network 20: a public user terminal 10, a laptop 50, a PC 60, and a telephone 70. The public 
user terminal 10 has a display 11, a keyboard 12, a microphone/loudspeaker combination 13,
an insertion slot 14 for a chip card, and a processing unit 42. Furthermore, processing units 
40 ... 41 and data memories 30 ... 31 are connected to the network 20. 

Possible utilization scenarios according to the invention of the system 
arrangement shown in Fig. 1 will now be described with reference to the flowcharts shown in 
the subsequent Figures.

Fig. 2 shows the sequence of obtaining an authorization code for speech 
synthesis of a text by means of the language profile of a well known personality in the form 
of a flowchart. In start block 101, a user makes contact with a system according to the 
invention for text processing. For this purpose, for example, he establishes a communication 
link to a processing unit 40 ... 41 via the Internet 20 by typing a corresponding Internet
address on his home computer 60. The processing unit 40 ... 41 then asks him in block 102
what he wants to do. 

The further control branches off after the decision block 103 depending on 
what the user wants. It is assumed here that the user wants to obtain an authorization code 
which allows him to synthesize a certain text into speech several times with the language
profile of a well known personality XY. In this case, the control switches to block 105, in all
other cases to a further block 104 not discussed here and dealing with the other user 
possibilities. 

The processing unit 40 ... 41 asks the user in block 105 to communicate the 
name of the envisaged well known personality. It is then tested in block 106 whether said
personality is known to the processing unit 40 ... 41. If this is not the case, an alternative 
sequence follows in block 107 which is not discussed further here. In the opposite case, the 
control switches from block 106 to block 108.

In block 108, the user is requested to enter, in that order, the text and the
number of times he wants to use the language profile of the well known personality, and also
to pay the fee for the desired authorization code. This payment may take place, for example, 
through the input of a credit card number. If a public user terminal 10 is used instead of the 
PC 60, it may also be done through the insertion, for example, of a cash card into the slot 14
of the public user terminal 10. The text may be an own composition of the user, which is then 
transmitted to the processing unit 40 ... 41 in block 108. Alternatively, it is conceivable that 
the user selects a text prepared in one of the data memories 30 ... 31 in dialogue with the 
processing unit 40 ... 41. In that case, for example, it may concern a generally known 
birthday song.

Subsequently, the processing unit 40 ... 41 generates an authorization code for 
the desired application, for example in the form of a unique random sequence of digits, stores 
it in one of the data memories 30 ... 31, and communicates it to the user. If the text was 
communicated to the processing unit 40 ... 41 by the user, this text may also be stored in one
of the data memories 30 ... 31. The necessary management data are also stored in one of the
data memories 30 ... 31, indicating the interrelationships between the authorization code, the 
text, and the language profile of the well known personality, and also specifying the number 
of times the authorization code may be used. 
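
The generation and storage step just described might look as follows in Python. This is a sketch under stated assumptions: the dictionary `management_data` is a hypothetical in-memory stand-in for the data memories 30 ... 31, and the function name, field names, and the default length of 12 digits are illustrative choices that do not come from the description.

```python
import secrets

# Hypothetical in-memory stand-in for the data memories 30 ... 31.
management_data = {}


def issue_authorization_code(profile_id: str, text: str, max_uses: int,
                             digits: int = 12) -> str:
    """Generate a unique random digit sequence and store its management data."""
    while True:
        code = "".join(secrets.choice("0123456789") for _ in range(digits))
        if code not in management_data:   # draw again until the code is unique
            break
    management_data[code] = {
        "profile_id": profile_id,   # language profile the code belongs to
        "text": text,               # text the code is bound to
        "max_uses": max_uses,       # number of times the code may be used
        "uses": 0,                  # number of uses so far
    }
    return code


code = issue_authorization_code("personality-XY", "Happy birthday to you", max_uses=3)
print(len(code), management_data[code]["max_uses"])
```

The `secrets` module is used rather than `random` so that codes are hard to guess, which seems appropriate for a code that stands for a paid entitlement.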

The user is finally asked for any further wishes in block 110. The user reply is
evaluated in block 111. If the user has expressed further wishes, the control is passed back to
block 103 again for further processing. If he has no further wishes, the interaction is
terminated in end block 112.

Fig. 3 diagrammatically shows the sequence of utilization of an authorization
code for speech synthesis of a text by means of the language profile of a well known
personality and the transmission of the synthesized text to a telephone number, in the form of
a flowchart. The blocks 201, 202, 203, and 204 correspond essentially to the blocks 101, 102, 
103, and 104 of Fig. 2. It is assumed here, however, that the user wants to utilize an 
authorization code previously obtained, for example in accordance with the sequence of Fig. 
2, so as to synthesize the text into speech by the language profile of the requested well known
personality, whereupon it is sent to a telephone number of his choice. In order to contact one
of the processing units 40 ... 41, for example, he may use a laptop 50, a PC 60, or a telephone 
70. Alternatively, however, he may use a public user terminal 10 and then communicate, for 
example, with the local processing unit 42 thereof, which unit may make contact with one of 
the processing units 40 ... 41 via the network 20 whenever required.

In block 205, the processing unit 40 ... 41, 42 asks the user to enter the
authorization code. Depending on the system construction and application, a request may also 
be made for entering an unequivocal characterization of the well known personality and/or of 
the text. It is assumed here, however, that the authorization code was obtained in a scenario 
in accordance with Fig. 2 and belongs uniquely to a well known personality and a certain
text, which interrelationship is stored in the system, for example in one of the data memories 
30 ... 31. Alternatively, the authorization code could characterize only the language profile 
and, for example, could allow the once-only synthesis of any text as desired which is shorter 
than a maximum length. In that case the text to be synthesized should obviously be made
known to the processing unit 40 ... 41, 42 at this point.

Block 206 tests the validity of the authorization code for the respective 
application. Depending on the application, this test involves the questions as to whether the 
authorization code belongs to the desired language profile, whether it belongs to the text to be 
synthesized, and whether the authorization code has not yet expired, i.e. whether it was not
already utilized the envisaged maximum number of times.
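
The three questions of the validity test in block 206 can be sketched in Python as follows. The `store` layout, the function name `is_code_valid`, and the convention that a stored text of `None` means "any text" are assumptions made for illustration only.

```python
def is_code_valid(store: dict, code: str, profile_id: str, text: str) -> bool:
    """Return True if `code` legitimizes synthesizing `text` with `profile_id`."""
    entry = store.get(code)
    if entry is None:
        return False                            # unknown authorization code
    if entry["profile_id"] != profile_id:
        return False                            # code belongs to another profile
    if entry["text"] is not None and entry["text"] != text:
        return False                            # code is bound to a different text
    return entry["uses"] < entry["max_uses"]    # not yet used the maximum number of times


# Hypothetical stored management data for a single code.
store = {"4711": {"profile_id": "personality-XY", "text": "Happy birthday",
                  "max_uses": 1, "uses": 0}}

print(is_code_valid(store, "4711", "personality-XY", "Happy birthday"))  # True
print(is_code_valid(store, "4711", "personality-XY", "Spurious text"))   # False
```

The function only tests; in this sketch the caller would increment `uses` once the synthesis has actually been carried out.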

If the authorization code is invalid, a further treatment of the interaction not 
discussed here takes place in block 207. In the opposite case, the processing unit 40 ... 41, 42 
requests the user in block 208 to indicate the telephone number to which the
speech-synthesized text is to be sent. The synthesis of the text into speech also takes place in this
block, as well as the dialing of the telephone number and the reading of the
speech-synthesized text to a recipient answering the relevant telephone number.

This block may in particular be embodied in a plurality of ways. For example, 
the processing unit 40 ... 41, 42 may first connect the user with the recipient so that the user 
can ascertain whether the correct recipient has answered the telephone call and can announce
to the latter that the processing unit 40 ... 41, 42 will now read a message to him/her. The user
may then obtain a connection to the processing unit 40 ... 41, 42 again by pressing a special 
button, for example if a telephone 70 is used. Instead, the processing unit 40 ... 41, 42 may 
itself read an introductory text to the participant. If no participant answers the telephone 
initially, the processing unit 40 ... 41, 42 may make a fresh attempt to achieve a connection
with the envisaged telephone number after a delay period. Alternatively, for example, the
user may determine that the speech-synthesized text is sent as an audio attachment of an
e-mail to an e-mail address in this case, which address must then obviously be
communicated to the system.

After block 208, the further processing steps 210, 211, and 212 again
essentially correspond to those of blocks 110, 111, and 112 of Fig. 2.

Figs. 4a and 4b diagrammatically show the sequence of preparing, sending,
receiving, and speech-synthesizing a text by means of the language profile of the author of
the text in the form of two flowcharts.

Fig. 4a first shows the preparation and transmission of a text together with the 
language profile of the author of the text in the form of an e-mail. After the start in block 301, 
the text is prepared in block 302. An authorization code is subsequently generated in block 
303, characterizing the text and the author's language profile. To generate such an 
authorization code, for example, the text and the language profile may be represented as a
moving bit sequence which is subsequently imaged onto a number of manageable size by 
means of a hashing procedure. To assure the integrity of the interrelationship between text, 
language profile, and authorization code, the authorization code is additionally encrypted in 
block 303 in an asymmetrical encrypting process such as, for example, the RSA algorithm 
with the private key of the author of the text.
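
A rough Python sketch of block 303 is given here for orientation. It rests on several assumptions that are not in the description: SHA-256 serves as the hashing procedure, the key is a toy RSA key built from two small Mersenne primes, and "encrypting with the private key" is done as unpadded textbook RSA. Such a key is far too small for real use; a production system would rely on a vetted cryptographic library with a proper padding scheme. The names `hash_value` and `make_authorization_code` are hypothetical.

```python
import hashlib

# Toy RSA key built from two known Mersenne primes. Far too small and
# unpadded for real use; assumed only to keep the sketch self-contained.
P, Q = 2**127 - 1, 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def hash_value(text: str, profile: bytes) -> int:
    """Map text and language profile, taken as one bit sequence, to a number."""
    digest = hashlib.sha256(text.encode("utf-8") + profile).digest()
    return int.from_bytes(digest, "big") % N


def make_authorization_code(text: str, profile: bytes) -> int:
    """Encrypt the hash value with the author's private key (textbook RSA)."""
    return pow(hash_value(text, profile), D, N)


code = make_authorization_code("Happy birthday!", b"author-profile-data")
print(code != make_authorization_code("Another text", b"author-profile-data"))
```

Because the code depends on both the text and the profile, changing either one yields a different authorization code, which is what ties the three items together.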

In block 304, the text is sent together with the language profile of the author, 
the encrypted authorization code, and a unique identifier of the author in the form of an
e-mail. The process then ends in block 305.

Fig. 4b shows the reception and speech synthesis of the received text with the 
language profile of the author of the text. After the start block 310, an e-mail is first received
in block 311 containing a text, the language profile of the author of the text, an encrypted 
authorization code, and a unique identifier of the author. Also in block 311, the text, language
profile, encrypted authorization code, and author identifier are provided to a text processing 
system according to the invention.

This system tests in block 312 whether the authorization code is valid. For this
purpose, the encrypted authorization code is deciphered with the public key of the author of
the text. If this deciphering step is successful, and the deciphered authorization code 
corresponds to the hash value of the transmitted text and language profile in the hashing 
procedure mentioned above, the authorization code is valid. If the authorization code is 
invalid, a further treatment of the interaction with the receiver is carried out in block 313
which is not discussed any further here. In the opposite case, the text is synthesized into 
speech in block 314 by means of the language profile, and the synthesized text is given out as 
an audio signal, whereupon the interaction with the recipient is ended in block 315. 
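
The test of block 312 can be sketched in Python along the following lines. The assumptions are the same hedged ones as for any toy illustration of this scheme: SHA-256 as the hashing procedure, unpadded textbook RSA, and a deliberately small key built from two Mersenne primes that would be unsafe in practice. The function names are hypothetical, and a real receiver would look up the author's public key by means of the transmitted author identifier instead of using module-level constants.

```python
import hashlib

# Toy public key (N, E); the primes appear here only so that the demo below
# can also play the sender. Not suitable for real use.
P, Q = 2**127 - 1, 2**89 - 1
N, E = P * Q, 65537


def hash_value(text: str, profile: bytes) -> int:
    """Hash text and language profile with the procedure agreed with the sender."""
    digest = hashlib.sha256(text.encode("utf-8") + profile).digest()
    return int.from_bytes(digest, "big") % N


def is_authorized(text: str, profile: bytes, encrypted_code: int) -> bool:
    """Decipher the code with the public key and compare it with the hash value."""
    if not 0 <= encrypted_code < N:
        return False
    return pow(encrypted_code, E, N) == hash_value(text, profile)


# Demo only: create a code as the sender would, using the private exponent.
D = pow(E, -1, (P - 1) * (Q - 1))      # needs Python 3.8+
text, profile = "Happy birthday!", b"author-profile-data"
code = pow(hash_value(text, profile), D, N)

print(is_authorized(text, profile, code))             # True
print(is_authorized("Spurious text", profile, code))  # False
```

Only when `is_authorized` returns True would the text be passed on to the speech synthesis with the received language profile.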
In Figs. 4a and 4b, the authorization code encrypted with the private key of the
author of the text in combination with the unique author identifier constitutes a signature 
which proves the authenticity of the text and the language profile and which is utilized by a 
text processing system according to the invention for testing the authorization of the use of 
the language profile for synthesizing the text into speech.

In a modification of this scenario, an author of the text may want to allow a
general utilization of his language profile for synthesizing his texts into speech. In this case, 
it would suffice for the author of the text to make his language profile available for storage 
once and for all to a text processing system according to the invention which is used at the
receiving side, and thus only to sign his texts and send them in accordance with the prior art.
The text processing system according to the invention employed at the receiving side then 
uses the authenticity indicator of a text from the author provided by the signature for 
allowing the use of the language profile of the author for synthesizing the text into speech.

The scenarios of use of a text processing system according to the invention
presented in Figs. 2, 3, 4a, and 4b are merely examples from among a plurality which
comprises many further conceivable modifications. It is accordingly noted that a strongly 
decentralized architecture with components communicating via a network 20 was described 
with reference to Figs. 2 and 3, whereas Figs. 4a and 4b instead offer a local scenario in 
which all necessary actions take place locally at the transmitting or receiving side, and a
network is merely used for the transmission of an e-mail. It is obvious that these architectures
may be mixed, i.e. in the case of Figs. 4a and 4b, for example, the language profile need not 
necessarily be transmitted along with the text if it is present in one of the data memories 30 ... 
31 instead, from where the receiving party can obtain it. 

Furthermore, reference was made only to a text processing system for keeping
the description of the Figures concise. It is obvious, however, that the invention is equally
suitable for the processing of a text, gesture, facial expression, and/or behavior description. 
The invention may also be used in conjunction with language, gesture, facial expression, 
and/or behavior recognition systems which generate the text, gesture, facial expression, 
and/or behavior descriptions independently by means of a recognition process. Reference is
also made once more to the possibilities noted above that text, gesture, facial expression,
and/or behavior descriptions can be synthesized by means of speech, gesture, facial 
expression, and/or behavior profiles in a plurality of ways. 
CLAIMS:



1. A method of processing a text, gesture, facial expression, and/or behavior 
description wherein the text, gesture, facial expression, and/or behavior description is 
synthesized by means of a speech, gesture, facial expression, and/or behavior profile, 
provided that an authorization code legitimizes the use of the speech, gesture, facial
expression, and/or behavior profile for the synthesis.

2. A method as claimed in claim 1, characterized in that the text, gesture, facial 
expression, and/or behavior description is synthesized by means of the speech, gesture, facial 
expression, and/or behavior profile, provided that the authorization code legitimizes the use
of the speech, gesture, facial expression, and/or behavior profile for the synthesis of said text,
gesture, facial expression, and/or behavior description. 

3. A method as claimed in claim 1 or 2, characterized in that the text, gesture, 
facial expression, and/or behavior description is synthesized by means of the speech, gesture,
facial expression, and/or behavior profile if the number of times for which the authorization
code has already legitimized the use of the speech, gesture, facial expression, and/or behavior 
profile for the synthesis is smaller than a given first number, and/or if the number of times for 
which the authorization code has legitimized the use of the speech, gesture, facial expression, 
and/or behavior profile for synthesizing said text, gesture, facial expression, and/or behavior
description is smaller than a given second number.

4. A method as claimed in any one of the claims 1 to 3, characterized in that the 
authorization code is encrypted and is deciphered in the method, wherein in particular an 
asymmetrical cryptography method, in particular the RSA method or an elliptic curve
method, is used for encrypting and deciphering.



5. A system for processing a text, gesture, facial expression, and/or behavior
description wherein the text, gesture, facial expression, and/or behavior description is
synthesized by means of a speech, gesture, facial expression, and/or behavior profile,
provided that an authorization code legitimizes the use of the speech, gesture, facial
expression, and/or behavior profile for the synthesis.